Evaluation of efficiency and sensitivity of 1D and 2D sample pooling strategies for SARS-CoV-2 RT-qPCR screening purposes

To increase the throughput, lower the cost, and save scarce test reagents, laboratories can pool patient samples before SARS-CoV-2 RT-qPCR testing. While different sample pooling methods have been proposed and effectively implemented in some laboratories, no systematic and large-scale evaluations exist using real-life quantitative data gathered throughout the different epidemiological stages. Here, we use anonymous data from 9673 positive cases to model, simulate and compare 1D and 2D pooling strategies. We show that the optimal choice of pooling method and pool size is an intricate decision with a testing population-dependent efficiency-sensitivity trade-off and present an online tool to provide the reader with custom real-time 1D pooling strategy recommendations.

www.nature.com/scientificreports/

We questioned to what extent optimal pooling strategies would have changed throughout the COVID-19 pandemic and how testing facilities might use pooling strategies for future testing in a correct and attainable manner. To this end, we simulated and evaluated one-dimensional (1D) and two-dimensional (2D) pooling strategies with different pool sizes using real-life RT-qPCR data gathered by the Belgian national testing platform during the end of the first and the beginning of the second SARS-CoV-2 epidemiological waves.

Materials and methods
Patient samples. Nasopharyngeal swabs were collected in VTM or DNA/RNA Shield (Zymo Research) by a healthcare professional as a diagnostic test for SARS-CoV-2, as part of the Belgian national testing platform. The individuals were tested at nursing homes or in triage centers, between April 9th and June 7th, and between September 1st and November 10th. After filtering the data as described further, this resulted in 207,944 patients in total, of which 9673 positives (4.65%).

SARS-CoV-2 RT-qPCR test.
During the first (spring) wave, RNA extraction was performed using the Total RNA Purification Kit (Norgen Biotek #24300) according to the manufacturer's instructions using 200 µl transport medium, 200 µl lysis buffer and 200 µl ethanol, with processing using a centrifuge (5810R with rotor A-4-81, both from Eppendorf). RNA was eluted from the plates using 50 µl elution buffer (nuclease-free water), resulting in approximately 45 µl eluate. RNA extractions were simultaneously performed for 94 patient samples and 2 negative controls (nuclease-free water). After addition of the lysis buffer, 4 µl of a proprietary 700 nucleotides spike-in control RNA (prior to May 25th, 40 000 copies for singleplex RT-qPCR; from May 25th onwards, 5000 copies for duplex RT-qPCR) and carrier RNA (200 ng of yeast tRNA, Roche #10109517001) was added to all 96 wells from the plate. To the eluate of one of the negative control wells, 7500 RNA copies of positive control RNA (Synthetic SARS-CoV-2 RNA Control 2, Twist Biosciences #102024) were added. During the second (autumn) wave, RNA extraction was performed using the Quick-RNA Viral 96 Kit (Zymo Research #R1041), according to the manufacturer's instructions using 100 µl transport medium, with processing using a centrifuge (5810R with rotor A-4-81, both from Eppendorf). RNA was eluted from the plates using 30 µl elution buffer (nuclease-free water). RNA extractions were simultaneously performed for 92 patient samples, 2 negative controls (nuclease-free water), and 2 positive controls (1 diluted positive case as a full workflow control; 1 positive control RNA as RT-qPCR control, see further). After addition of the lysis buffer, 4 µl of a proprietary 700 nucleotides spike-in control RNA (5000 copies) and carrier RNA (200 ng of yeast tRNA, Roche #10109517001) was added to all 96 wells from the plate. 
To the eluate of one of the negative control wells, 7500 RNA copies of positive control RNA (Synthetic SARS-CoV-2 RNA Control 2, Twist Biosciences #102024) were added.
Six µl of RNA eluate was used as input for a 20 µl RT-qPCR reaction in a CFX384 qPCR instrument using 10 µl iTaq one-step RT-qPCR mastermix (Bio-Rad #1725141) according to the manufacturer's instructions, using 250 nM final concentration of primers and 400 nM of hydrolysis probe. Primers and probes were synthesized by Integrated DNA Technologies using clean-room GMP production. For detection of the SARS-CoV-2 virus, the Charité E gene assay was used (FAM) 18; for the internal control, a proprietary hydrolysis probe assay (HEX) was used.

The sensitivity is calculated as:

$$\text{sensitivity} = \frac{\text{no. positive samples detected with the pooling strategy}}{\text{no. positive samples detected with individual testing}}$$

The analytical efficiency gain is calculated as:

$$\text{efficiency gain} = \frac{\text{no. tests required for individual testing}}{\text{no. tests required for pooling strategy}}$$

In all simulations, the number of tests required for individual testing is equal to the number of samples (assuming no technical failures). The outcomes for each simulation were identical as the sample size far exceeded the size of the dataset. The code is available at https://github.com/OncoRNALab/covidpooling.
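The two-stage 1D scheme evaluated in the simulations can be sketched as follows. This is a minimal illustration and not the code from the repository: the function name `simulate_1d_pooling`, the pooled-Cq rule (averaging the 2^-Cq signals of all pool members, which assumes 100% PCR efficiency and equal volumes), and the rule that every member of a positive pool is retested and called correctly are assumptions made here.

```python
import math
import random


def simulate_1d_pooling(cq_values, pool_size, cutoff=37.0, seed=0):
    """Simulate two-stage 1D pooling.

    cq_values: one entry per sample, a float Cq for a positive sample
               and None for a negative sample.
    Returns (sensitivity, efficiency gain).
    """
    rng = random.Random(seed)
    samples = list(cq_values)
    rng.shuffle(samples)  # random assignment of samples to pools

    total_positives = sum(cq is not None for cq in samples)
    n_tests = 0
    detected = 0

    for start in range(0, len(samples), pool_size):
        pool = samples[start:start + pool_size]
        n_tests += 1  # one test for the pool itself

        # Assumed dilution model: mean of the 2^-Cq signals in the pool.
        signal = sum(2.0 ** -cq for cq in pool if cq is not None) / len(pool)
        pool_positive = signal > 0 and -math.log2(signal) <= cutoff

        if pool_positive:
            n_tests += len(pool)  # retest every member individually
            detected += sum(cq is not None for cq in pool)

    sensitivity = detected / total_positives if total_positives else float("nan")
    efficiency = len(samples) / n_tests
    return sensitivity, efficiency
```

With a pool of eight, a single positive at Cq 20 is diluted to a pooled Cq of 23 and is found, whereas a single positive at Cq 36 is diluted to 39 and is missed unless a stronger positive shares its pool.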
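Ad hoc sensitivity and efficiency calculation. To calculate the efficiency for a specific 1D pooling strategy on a real sample set, the following equation was used:

$$\text{efficiency} = \frac{n}{\frac{n}{s} + n\left(1 - \left(1 - p(1 - c)\right)^{s}\right)} \tag{4}$$

with sample size $n$, pool size $s$, fraction of positive samples $p$ and $c$ the fraction of Cq values of positive samples above the 'dilution detection limit': the highest individual Cq value that can, on its own, still result in a pooled Cq value lower than the single-molecule Cq value, or:

$$\text{dilution detection limit} = Cq_{\text{single molecule}} - \log_2(s) \tag{5}$$

Equation (4) is derived as follows. The efficiency is defined by the following equation:

$$\text{efficiency} = \frac{n}{\text{no. tests required for pooling strategy}} \tag{6}$$

The number of tests performed when using a pooling strategy is equal to:

$$\text{no. tests required for pooling strategy} = \text{no. pools} + \text{no. positive pools} \cdot s \tag{7}$$

$$\text{no. tests required for pooling strategy} = \frac{n}{s} + \text{no. positive pools} \cdot s \tag{8}$$

The expected number of positive pools can be calculated by multiplying the number of pools by the probability of a pool testing positive:

$$\text{no. tests required for pooling strategy} = \frac{n}{s} + \frac{n}{s} \cdot P(\text{pos pool}) \cdot s \tag{9}$$

Approximately, a pool will test positive if it includes a positive sample with a Cq value lower than the 'dilution detection limit'. The probability of having a specific number of positive samples $k$ in a pool with pool size $s$ is defined by a binomial distribution:

$$P(k) = \binom{s}{k} p^{k} (1 - p)^{s - k}$$

Thus, the probability of having at least one positive sample in a pool is equal to:

$$P(k \geq 1) = 1 - P(k = 0) = 1 - (1 - p)^{s} \tag{10}$$

In general, we can assume that when a sample has a Cq value higher than the 'dilution detection limit', for the sample to test positive, it must be accompanied by a sample with a Cq value lower than the 'dilution detection limit'. Equation (10) can be adjusted to account for these events:

$$P(\text{pos pool}) = 1 - \left(1 - p(1 - c)\right)^{s} \tag{11}$$

Filling in Eq. (11) in Eq. (9) results in the final formula being used for the calculation of the efficiency.

To estimate the sensitivity for a specific 1D pooling strategy on a real sample set, the following equation was used:

$$\text{sensitivity} = (1 - c) + c\left(1 - \left(1 - p(1 - c)\right)^{s - 1}\right) \tag{12}$$

The sensitivity can be defined as the probability that a true positive sample tests positive. For our situation it will be equal to the probability that any positive sample tests positive:

$$P(\text{pos test}) = P(\text{pos test} \mid Cq \geq \text{cut off}) \cdot P(Cq \geq \text{cut off}) + P(\text{pos test} \mid Cq < \text{cut off}) \cdot P(Cq < \text{cut off}) \tag{13}$$

Previously, $P(Cq \geq \text{cut off})$ was defined as $c$ and therefore $P(Cq < \text{cut off}) = 1 - c$. Also, $P(\text{pos test} \mid Cq < \text{cut off}) = 1$.

A positive sample with a Cq value above the 'dilution detection limit' can only test positive if one of the other samples in the pool is also positive and has a Cq value lower than the 'dilution detection limit'. We can calculate the probability of this happening by using the same logic as before, but with $s - 1$ instead of $s$:

$$P(\text{pos test} \mid Cq \geq \text{cut off}) = 1 - \left(1 - p(1 - c)\right)^{s - 1} \tag{14}$$

Completing Eq. (13) with Eq. (14) leads to Eq. (12) for calculating the sensitivity.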
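The closed-form estimates described in this section can be sketched in Python. The function names are hypothetical, and the formulas in the comments are a reading of the derivation above under two assumptions: the 'dilution detection limit' equals the single-molecule Cq minus log2(s) (100% PCR efficiency), and a pool tests positive only when it contains at least one positive sample below that limit.

```python
import math


def dilution_detection_limit(single_molecule_cq, s):
    # Assumed form: pooling s samples dilutes each one s-fold (+log2(s) cycles).
    return single_molecule_cq - math.log2(s)


def fraction_above_limit(positive_cqs, single_molecule_cq, s):
    # c: fraction of positive samples with Cq at or above the limit.
    limit = dilution_detection_limit(single_molecule_cq, s)
    return sum(cq >= limit for cq in positive_cqs) / len(positive_cqs)


def pool_positive_probability(p, c, s):
    # 1 - (1 - p(1 - c))^s: at least one detectable positive in the pool.
    return 1.0 - (1.0 - p * (1.0 - c)) ** s


def efficiency_1d(p, c, s):
    # n / (n/s + n * P(pos pool)); the sample size n cancels out.
    return 1.0 / (1.0 / s + pool_positive_probability(p, c, s))


def sensitivity_1d(p, c, s):
    # (1 - c) + c * P(at least one detectable positive among the other s - 1).
    rescue = 1.0 - (1.0 - p * (1.0 - c)) ** (s - 1)
    return (1.0 - c) + c * rescue
```

Two limiting cases are a useful sanity check: with no positives (`p = 0`) the efficiency equals the pool size, and with no high-Cq positives (`c = 0`) the sensitivity equals 1.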

Web application.
To help laboratories find the best pooling strategy for their specific situation (i.e. the local positivity ratio and Cq value distribution), we developed a Shiny application in R 4.0.1. The Shiny application was launched on our in-house Shiny server and is available at https://shiny.dev.cmgg.be/.
Ethical declarations. Our analyses have been approved under EC/071-2020 by the Ghent University Hospital ethical committee. No biological samples were used in this study and the analyses were based solely on anonymized data gathered by the Belgian national testing platform. The need for an informed consent was waived by the ethics committee.

Single-molecule Cq value determination.
To infer the single-molecule Cq value (specific to our laboratory set-up), we made a 5-point tenfold serial dilution series of positive control RNA from 150,000 (digital PCR calibrated) copies down to 15 copies. The Y-intercept value of the linear regression points at a single-molecule Cq value of 35.66 and 35.28 for singleplex and duplex RT-qPCR (which are the two methods that were used, see "Materials and methods"), respectively (Supplemental Fig. 1). Therefore, we conservatively use 37 as the single-molecule value for further analysis. Patient sample Cq values higher than the single-molecule Cq value threshold are likely due to random measurement variation, lot reagent variability and sample inhibition. Note that this cut off is highly dependent on the experimental procedure and reagents used and is, therefore, specific to our experimental setting.
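The intercept-based estimate can be illustrated with a short least-squares fit of Cq against log10(copies). The Cq values below are synthetic: they are generated from an assumed slope of -3.32 cycles per tenfold dilution (roughly 100% PCR efficiency) and an assumed intercept, so the fit only demonstrates the procedure and does not reproduce the measured dilution series.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# 5-point tenfold dilution series, as described in the text.
copies = [150_000, 15_000, 1_500, 150, 15]

# Synthetic Cq values (assumption, not measured data).
ASSUMED_SLOPE = -3.32
ASSUMED_INTERCEPT = 35.66
cq_values = [ASSUMED_INTERCEPT + ASSUMED_SLOPE * math.log10(n) for n in copies]

slope, intercept = fit_line([math.log10(n) for n in copies], cq_values)

# log10(1 copy) = 0, so the Y-intercept is the extrapolated single-molecule Cq.
single_molecule_cq = intercept
print(f"slope = {slope:.2f}, single-molecule Cq = {single_molecule_cq:.2f}")
```

Because the x-axis is log10(copies), the intercept corresponds to one input copy, which is why it can be read as the single-molecule Cq value.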
Cq distribution is dynamic over the course of the pandemic. Few studies have explored how the Cq value distribution within one testing facility evolves during the COVID-19 pandemic. We determined the 75%-tile of the Cq value distribution (i.e. 75% of the data have a lower Cq value) and the percentage of positive tests per day as a proxy for actual Cq value distribution and prevalence, respectively (Fig. 2). We compared the fraction of positive tests in our dataset with the fraction of positive tests as reported by the federal agency for public health Sciensano (https://epistat.wiv-isp.be/covid/, accessed January 25th, 2021). First, the fractions of positive tests seem to align at the end of the first wave, but in the second wave our data seem to be shifted about 1-2 weeks later. Second, the 75%-tile of the Cq values varies over the course of the pandemic between a minimum value of around 18 and a maximum value of almost 35. Third, when comparing the fraction of positive samples and the 75%-tile of the Cq value distribution, we note that these parameters are inversely related: when the positivity rate goes down, the Cq value distribution shifts towards the higher end of the spectrum (Supplemental […]

[…] wave. The data were grouped by week and the resulting Cq value distributions and positivity rates were used as input for the simulations (Fig. 3). First, sensitivity and efficiency show opposing patterns when comparing different timeframes during the pandemic. At the end of the first wave the efficiency increases, while at the beginning of the second wave, the efficiency decreases. The sensitivity drops as we move further away from the first wave but remains stable as we enter the second. Second, pool size and strategy have a major influence on the outcomes. 2D pooling strategies generally have the highest efficiency, but the lowest sensitivity.
Curiously, strategies with larger pool sizes were more efficient during the end of the first wave, but less efficient during the beginning of the second wave. The sensitivity was always higher for strategies with smaller pool sizes, irrespective of the time during the pandemic. We conclude that, just like the positivity rate and the Cq value distribution, the sensitivity and efficiency depend on the timing in the pandemic and are heavily affected by the pooling strategy and the size of the pools.
Positivity rate drives efficiency, Cq distribution drives sensitivity. We wondered how the positivity rate, Cq value distribution and pooling strategy affect the performance of the adopted strategy. To investigate this, we used the previous simulations for the end of the first wave to create an adjusted visualization where all parameters involved are incorporated (Fig. 4). First, it is apparent that weeks with a high 75%-tile Cq value tend to have a low sensitivity and weeks with a high positivity rate seem to have a low efficiency. Second, pooling strategies with smaller pool sizes seem less sensitive to changes in positivity rate and Cq value distribution, as indicated by the area of the polygon traced around the edges of the data (Fig. 4). These results show that the prevalence mainly contributes to the efficiency and the Cq distribution to the sensitivity.

Shiny app for guided decision making.
To provide laboratories with a custom pooling strategy recommendation based on their specific sampling population, we worked out equations to estimate the sensitivity and efficiency (for 1D pooling strategies) based on an uploaded dataset of Cq values. The derivation of these equations can be found in the "Materials and methods" section. We focused on 1D pooling strategies since 2D pooling strategies generally resulted in extreme outcomes (highest efficiency and lowest sensitivity) and the outcomes of the optimal pooling strategy are situated somewhere between the two extremes. Thus, we provide no tool for 2D pooling methods. To evaluate the equations' capacity to replicate the simulations, we compared the simulated efficiency and sensitivity of the pooling strategies for the different weeks with the efficiency and sensitivity of the pooling strategies calculated using the Cq value distributions, fraction of positive samples and single-molecule cut off as inputs for the formulas (Supplemental Figs. 3, 4). We integrated these formulas into an open-access Shiny application (Supplemental Fig. 5). The application provides a step-by-step guide to completing the input variables and parameters. A Frequently Asked Questions (FAQ) section answers some questions on the workings and general idea of the application. The tool requires four inputs: a dataset of Cq values from positive samples, the positivity rate, the single-molecule cut off Cq value and a range of pool sizes of interest. The Shiny application then outputs the estimated data-specific efficiency and sensitivity for different pooling strategies. The application shows a graphic representation of the dataset-specific change in sensitivity (green) and efficiency (black) for the requested range of pool sizes. The output also indicates the pool sizes resulting in maximum efficiency and sensitivity. The tool can be easily updated if comments or suggestions arise.
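The four inputs and the two recommended pool sizes can be mimicked in a few lines of Python. This is a hypothetical sketch and not the Shiny application's code (which is written in R): the function names are invented, and the estimates assume a dilution detection limit of the single-molecule Cq minus log2(pool size) together with the 1D formulas efficiency = 1 / (1/s + 1 - (1 - p(1 - c))^s) and sensitivity = (1 - c) + c(1 - (1 - p(1 - c))^(s - 1)).

```python
import math


def estimate_1d(positive_cqs, positivity_rate, single_molecule_cq, pool_size):
    """Return (efficiency, sensitivity) estimates for one pool size."""
    limit = single_molecule_cq - math.log2(pool_size)  # assumed dilution limit
    c = sum(cq >= limit for cq in positive_cqs) / len(positive_cqs)
    q = 1.0 - positivity_rate * (1.0 - c)  # P(sample is not a detectable positive)
    efficiency = 1.0 / (1.0 / pool_size + 1.0 - q ** pool_size)
    sensitivity = (1.0 - c) + c * (1.0 - q ** (pool_size - 1))
    return efficiency, sensitivity


def recommend(positive_cqs, positivity_rate, single_molecule_cq, pool_sizes):
    """Evaluate every pool size; report the most efficient and most sensitive."""
    rows = [
        (s, *estimate_1d(positive_cqs, positivity_rate, single_molecule_cq, s))
        for s in pool_sizes
    ]
    best_efficiency = max(rows, key=lambda row: row[1])[0]
    best_sensitivity = max(rows, key=lambda row: row[2])[0]
    return rows, best_efficiency, best_sensitivity


if __name__ == "__main__":
    # Hypothetical Cq values of positive samples, for illustration only.
    example_cqs = [18.2, 21.5, 24.0, 27.3, 30.1, 32.8, 34.6, 36.2]
    rows, best_eff, best_sens = recommend(example_cqs, 0.05, 37.0, range(2, 17))
    for s, eff, sens in rows:
        print(f"pool size {s:2d}: efficiency {eff:.2f}, sensitivity {sens:.3f}")
    print("most efficient:", best_eff, "| most sensitive:", best_sens)
```

Note that `c` is recomputed for every pool size, because the dilution detection limit moves down as pools get larger.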

Discussion
Using a sizeable real-life dataset of 9673 SARS-CoV-2 positive nasopharyngeal samples, we found that the pooling strategies' sensitivity and efficiency mainly depend on the prevalence and the distribution of the Cq values.
Our results indicate that both the prevalence and the Cq value distribution are dynamic parameters during the SARS-CoV-2 pandemic and that, as a result, the sensitivity and efficiency of pooling strategies are as well. To provide researchers and institutions with a real-time and accessible recommendation concerning the optimal 1D pooling strategy for their testing population, we developed a Shiny app. Two factors could explain the dynamics of the prevalence and the Cq value distribution: epidemiological and virological change within the same sampling population and variation in the sampling population. The existence of these factors would suggest that an intricate interplay of these two components is at the origin of the observed evolutions. Recent research indicated that the first component (epidemiological change) exists, as the distribution of random surveillance testing-deduced Cq values fluctuates during the SARS-CoV-2 pandemic (by definition, no changes in sampling population occurred in this research, thereby excluding this factor from the equation) 20.
As such, an alarming change in reproduction number (inherently coupled to the epidemiological period) should induce a reassessment of the pooling strategy 21. The second component (variation in sampling population) is bound to happen when the testing facility is not consistently receiving samples from the same origin, as is the case for Biogazelle. At the very introduction of Biogazelle as a testing facility, most samples originated from hospitals and sources were added progressively as the testing capacity increased. Additionally, the Belgian government instituted a rapid change in the testing regime on October 21st, 2020: only symptomatic suspected SARS-CoV-2 cases were tested. The federal government lifted this measure on November 23rd, 2020, when the number of cases lowered and the existing testing capacity sufficed again. Since symptomatic patients generally show lower Cq values 22,23, it is clear that sampling bias will contribute to the overall Cq value distribution. The influence these dynamic parameters have on the variation of performance of pooling strategies is significant. This observation raises an issue for interpreting pooling strategy evaluations not based on time-series datasets. The effectiveness of a chosen pooling plan might even decrease to such an extent that it becomes inferior to individual testing. We observed this situation at the end of the second wave when efficiency is close to 1, but sensitivity is not (Fig. 3). Based on these results, it becomes essential to regularly re-evaluate an adopted pooling strategy to avoid compromising on sensitivity and efficiency when there is no need.
Multiple effects contribute to how the testing population's characteristics drive pooling strategy outcomes. The main trends show that the prevalence mainly influences efficiency, and the Cq value distribution mainly influences sensitivity (Fig. 3). We can explain both observations by using common sense and basic mathematics. When the prevalence is low, the efficiency is high: fewer pools will contain positive samples, so more pools test negative, which automatically results in a lower number of tests needed to test all samples. Additionally, when a considerable proportion of samples have a Cq value close to the single-molecule Cq value, a larger fraction of samples will become too diluted to detect during pooling and result in false negatives. There appear to be secondary compensating effects of the Cq value distribution and prevalence on the efficiency and sensitivity, respectively, which are more subtle. First, as a higher fraction of positive samples has a Cq value close to the upper limit, more pools will test (false) negative, boosting the efficiency. On the other hand, when the prevalence increases, the sensitivity will increase due to an effect we call 'rescuing': a high Cq value that would otherwise test negative when diluted in the pool is 'rescued' by a low Cq value in the same pool. When the prevalence rises, the chances of this phenomenon happening also increase, and so will the sensitivity. The same was observed by Cleary et al. 21. Although minor, these secondary effects explain several of our observations.
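The size of the 'rescuing' effect can be made concrete with a small worked example. The numbers are illustrative only (a pool size of 8 and a fixed 20% of positives above the dilution detection limit), and the calculation uses a reading of the 1D sensitivity estimate from "Materials and methods", written here as sensitivity = (1 - c) + c[1 - (1 - p(1 - c))^(s - 1)].

```latex
% Illustrative values: s = 8, c = 0.2; only the prevalence p changes.
\begin{align*}
p = 0.01:\quad & 1 - \bigl(1 - 0.01 \cdot 0.8\bigr)^{7} = 1 - 0.992^{7} \approx 0.055,
  & \text{sensitivity} &\approx 0.8 + 0.2 \cdot 0.055 \approx 0.811 \\
p = 0.10:\quad & 1 - \bigl(1 - 0.10 \cdot 0.8\bigr)^{7} = 1 - 0.92^{7} \approx 0.442,
  & \text{sensitivity} &\approx 0.8 + 0.2 \cdot 0.442 \approx 0.888
\end{align*}
```

Under these assumptions, a tenfold rise in prevalence lifts the estimated sensitivity by almost eight percentage points, solely because high-Cq samples more often share a pool with a detectable positive.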
To elaborate how the optimal pooling strategy (best efficiency-sensitivity trade-off) transforms over time, assume two situations: low prevalence and high prevalence. When the prevalence is low, larger pool sizes will result in higher efficiency and lower sensitivity (more dilution). However, when the prevalence is high, the 'rescuing' effect will be more prominent and counteract the increasing efficiency and decreasing sensitivity. These results are in line with the widely accepted idea that sample pooling methods show a higher efficiency when pool size is large and that, as prevalence increases, it reaches a threshold after which smaller pool sizes become more efficient 4,12. Intuitively, the 'rescuing' effect is less prominent in 2D pooling strategies, as both pools (row and column) need to rescue the high Cq sample.
False negatives have pre-pool Cq values close to the detection limit and predominantly originate from patients who are at the end of an infection 21,24, putting their clinical relevance in question (i.e. no longer infectious). Conversely, one can argue that these high Cq samples are imperative to a favorable pandemic response: they might originate from pre-symptomatic or very recently infected patients 21, allowing cases to be caught before transmission, a principle at the very core of every population screening strategy. Also, we cannot rule out that these high Cq values are due to imperfect sampling or any other mistakes along the sample preparation 16.
The relative priority of efficiency and sensitivity depends on the viral circulation. When the virus is widely circulating, and many tests are conducted, efficiency will be more critical as missing several positive samples will not have a remarkable effect. On the other hand, when the number of cases is extremely low, sensitivity should be prioritized: missing one sample can initiate an outbreak. Problematically, screening settings combine both previous settings: the number of tests is high, but the positivity rate is low. Such conditions require a purposely calculated efficiency-sensitivity trade-off.
Our study suffers from some important limitations. First, although the data grouped by weeks provides many different situations to assess, there will still be other combinations of parameters that we did not analyze in this paper. However, the current dataset probably represents the most plausible scenarios as the data originate from a protracted period of the pandemic. Second, we selected only 1D and 2D pooling methods in this simulation study. As stated before, other pooling regimes exist and might perform better than the discussed ones. Yet, these pooling strategies come with intrinsic shortcomings. The P-BEST pooling protocol is very time consuming 13, even when using a pipetting robot, and the repeated pooling method suffers from a complicated re-pooling scheme 4. Third, our model relies on the critical assumption that we can directly deduce the pool's Cq value from the individual samples' Cq values using a simple formula (see "Materials and methods"). Our pooling results are solely based on simulations (while the individual Cq values are experimentally determined) and wet lab experiments have shown that these do not necessarily correspond perfectly 8-11. Fourth, to calculate the pooling strategies' performance, the single-molecule Cq value and the prevalence must be known. However, we can easily calculate the single-molecule Cq value by generating a dilution series of a calibrated reference material as done in this study. Of note, this should be done in each lab individually as it depends on the experimental set-up. A testing laboratory can also choose to utilize a cut off Cq value different from the single-molecule Cq value. Once a threshold Cq value is determined, it should not be changed. The prevalence, however, cannot be known precisely and must therefore be estimated.
We can do this either before adopting a pooling strategy, by testing the individual samples and using the fraction of positive samples as an indication of the prevalence, or when a pooling strategy is already in place, by calculating it from the percentage of positive pools 5. […] calculated efficiency gain is merely a representation of the number of individual RNA extractions and RT-qPCR reactions and does not evaluate the amount of labor or time-to-result. Pooling a low number of samples will unnecessarily increase the time-to-result and workload.
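The back-calculation of prevalence from the percentage of positive pools can be sketched as follows. This is an illustration under stated assumptions and not the method of the cited reference: samples are assumed to be assigned to pools independently, a pool is assumed to test positive with probability 1 - (1 - p(1 - c))^s, and the function name and the optional correction for undetectable high-Cq positives (`c`) are hypothetical.

```python
def estimate_prevalence(positive_pool_fraction, pool_size, c=0.0):
    """Invert P(pos pool) = 1 - (1 - p * (1 - c)) ** pool_size for p.

    positive_pool_fraction: observed fraction of pools that tested positive.
    c: assumed fraction of positives too dilute to turn a pool positive
       on their own (0.0 ignores dilution losses).
    """
    if not 0.0 <= positive_pool_fraction < 1.0:
        raise ValueError("positive_pool_fraction must be in [0, 1)")
    if not 0.0 <= c < 1.0:
        raise ValueError("c must be in [0, 1)")
    # Per-sample probability of being a detectable positive.
    detectable = 1.0 - (1.0 - positive_pool_fraction) ** (1.0 / pool_size)
    return detectable / (1.0 - c)
```

If every pool tests positive, the prevalence cannot be recovered this way, which is why a fraction of exactly 1 is rejected.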
In conclusion, we show that finding the optimal pooling strategy for SARS-CoV-2 test samples is guided by a testing population-dependent efficiency-sensitivity trade-off. Consequently, the most favorable pooling regime might change throughout the pandemic due to epidemiological changes and revisions in diagnostic testing strategies. We provide an accessible Shiny application to guide readers towards the optimal pooling strategy to fit their needs.

Data availability
The code and Cq value data are available at https://github.com/OncoRNALab/covidpooling.